Abstract

We present a layered packet video coding algorithm based on a progressive
transmission scheme. The algorithm provides good compression and can handle
significant packet loss with graceful degradation in the reconstructed sequence.
Simulation results for various conditions are presented.


I. INTRODUCTION 

Due to the rapid evolution in the fields of image processing and networking, video information
will be an important part of tomorrow’s telecommunication system. Up to now, video has
been mainly transported over circuit-switched networks. It is quite likely that packet-switched
networks will dominate the communications world in the near future. Asynchronous transfer mode
(ATM) techniques in broadband ISDN can provide a flexible, independent and high performance
environment for video communication. Therefore, it is necessary to develop techniques for video
transmission over such networks.

The classic approach in circuit switching is to provide a "dedicated path," thus reserving
a continuous bandwidth capacity in advance. Any unused bandwidth capacity on the allocated
circuit is therefore wasted. Rapidly varying signals, like video signals, require too much
bandwidth to be accommodated by a standard circuit-switching channel. With a certain amount
of capacity assigned to a given source, if the output rate of that source is larger than the channel
capacity, quality will be degraded. If the generating rate is less than the available capacity,
the excess channel capacity is wasted. The use of packet networks allows for the utilization
of channel sharing protocols between independent sources and can improve channel utilization.
Another point that strongly favors packet-switched networks is the possibility that the integration
of services in a network will be facilitated if all of the signals are separated into packets with
the same format.

Some coding schemes which support packet video have been explored. Verbiest and Pinnoo
proposed a DPCM-based system which is comprised of an intrafield/interframe predictor, a
nonlinear quantizer, and a variable length coder [1]. Their codec obtains stable picture quality
by switching between three different coding modes: intrafield DPCM, interframe DPCM, and no
replenishment. Ghanbari has simulated a two-layer conditional replenishment codec with a first
layer based on hybrid DCT-DPCM and a second layer using DPCM [2]. This scheme generates
two types of packets: "guaranteed packets" contain vital information and enhancement packets
contain "add-on" information. Darragh and Baker presented a sub-band codec which attains
a user-prescribed fidelity by allowing the encoder’s compression rate to vary [3]. The codec’s
design is based on an algorithm that allocates distortion among the sub-bands to minimize
channel entropy. Kishino et al. describe a layered coding technique using discrete cosine
transform coding, which is suitable for packet loss compensation [4]. Karlsson and Vetterli
presented a sub-band coder using DPCM with a nonuniform quantizer followed by run-length
coding for baseband and PCM with run-length coding for nonbaseband [5]. In this paper, a
different coding scheme based on a progressive transmission scheme called Mixture Block
Coding with Progressive Transmission (MBCPT) [6,7] is investigated. Unlike the methods
mentioned above, MBCPT doesn’t use decimation and interpolation filters to separate the signals
into sub-bands. However, it does have the attractive property of dealing separately with high
frequency and low frequency information. This separation is obtained by the use of variable
blocksize transform coding.


This paper is organized as follows. First, some of the important characteristics and
requirements of packet video are discussed. In Section 3, the coding scheme called Mixture
Block Coding with Progressive Transmission (MBCPT) is presented. In Section 4, a network
simulator used in testing the scheme is introduced. In Section 5, the simulation results are
discussed. Finally, in Section 6 the paper is summarized.


II. CHARACTERISTICS OF PACKET VIDEO

The demand for various services, such as telemetry, terminal and computer connections, voice
communications, and full-motion high-resolution video, along with the wide range of bit rates and
holding times they represent, provides an impetus for building a Broadband Integrated Services
Digital Network (B-ISDN). B-ISDN is a projected worldwide public telecommunications network
that will service a wide range of user needs. The continuing advances in the technology of optical
fiber transmission and integrated circuit fabrication have been driving forces to realize B-ISDN.
The idea of B-ISDN is to build a complete end-to-end switched digital telecommunication network
with broadband channels. Still to be precisely defined by CCITT, with fiber transmission, H4
has an access rate of about 135 Mbps.

Packet-switched networks have the unique characteristics of dynamic bandwidth allocation
for transmission and switching resources, and the elimination of channel structure. They
acquire and release bandwidth as needed. Because video signals vary greatly in bandwidth
requirement, it is attractive to utilize a packet-switched network for video coded signals. Allowing
the transmission rate to vary, video coding based on packet transmission permits the possibility
of keeping the picture quality constant, by implementing "bandwidth on demand". There are
three main merits when transmitting video packets over a packet-switched network.

1. Improved and consistent image quality: if video signals are transmitted over fixed-rate 
circuits, there is a need to keep the coded bit rate constant, resulting in image degradation 
accompanying rapid motion. 

2. Multimedia integration: as mentioned above, integrated broadband services can be 
provided using unified protocols. 

3. Improved transmission efficiency: using variable bit-rate coding and channel sharing
among multiple video sources, scenes can be transmitted without distortion if other
sources, at the same time, are without rapid motion.

However, video transmission over packet networks also has the following drawbacks.

1. The time taken to transmit a packet of data may change from time to time.

2. Packets may be delayed to the point where, because of constraints due to the Human 
Visual System, they have to be discarded. 

3. Headers of packets may be changed because of errors and delivered to the wrong receiver. 

It has to be emphasized that the delay/loss effect can reach very high levels if the combined
users’ requirement exceeds the acquirable bandwidth and may seriously damage the quality of
the image.

When the signals transmitted in the network are nonstationary and circuit-switching is used
with limited bandwidth, a buffer between the coder and the channel is needed to smooth out
the varying rate. If the amount of data in the buffer exceeds a certain threshold, the encoder is
instructed to switch into a coding mode that has lower rate but worse quality to avoid buffer
overflow. In packet-switched networks, Asynchronous Time Division Multiplexing (ATDM) can
efficiently absorb temporal variations of the bit-rate of individual sources by smoothing out the
aggregate of several independent streams in the common network buffers [8].


To deliver packets in a limited time and provide a real time service is a difficult resource
allocation and control problem, especially when the source generates a high and greatly varying
rate. In packet-switched networks, packet losses are inevitable, but use of a packet-switched
network yields a better utilization of channel capacity. However, it should be noted that the
varying rate requirements of the video coder may not be synchronized with the variations in
available channel capacity which changes depending on the traffic in the network. Therefore,
the interactions between the coder and the network have to be considered and be incorporated
into the requirements for the coder. These requirements include:

1. Adaptability of the coding scheme: The video source we are dealing with has a varying
information rate. So it is expected that the encoder should generate different bit rates by
removing the redundancy. When the video is still, there is no need to transmit anything.

2. Insensitivity to error: The coding scheme has to be robust to packet loss so that
the quality of the image is never seriously damaged. Remember that retransmission is
impossible because of the tight timing requirement.





3. Resynchronization of the video: Because of the varying packet-generating rate and the lack
of a common clock between the coder and the decoder, we have to find a way to
reconstruct the received data which is synchronous to the display terminal.

4. Control of coding rate: Sensing the heavy traffic in the network, the coding scheme is 
required to adjust the coding rate by itself. In the case of a congested network, the 
coder could be switched to another mode which generates fewer bits with a minimal 
degradation of image quality. 

5. Parallel architecture: The coder should preferably be implemented in parallel. That allows 
the coding procedure to be run at a lower rate in many parallel streams. 

In the next section, we investigate a coding scheme to see how well it satisfies the above
requirements.


III. MIXTURE BLOCK CODING WITH PROGRESSIVE 
TRANSMISSION 

Mixture Block Coding (MBC) is a variable-blocksize transform coding algorithm which
codes the image with different blocksizes depending upon the complexity of that block area.
Low-complexity areas are coded with a large blocksize transform coder while high-complexity
regions are coded with a small blocksize. The complexity of a specific block is determined by
the distortion between the coded and original image when the same number of bits are used to
code each block. A more complex image block has higher distortion. The advantage of using
MBC is that it does not process regions of different complexity with the same blocksize. That means
MBC has the ability to choose a finer or coarser coding scheme to deal with parts of the same
image that differ in complexity. With the same rate, MBC is able to provide an image of higher
quality than a coding scheme which codes regions of different complexity with the same blocksize coder.

When using MBC, the image is divided into maximum blocksize blocks. After coding, the
distortion between the reconstructed and original block is calculated. The block being processed
is subdivided into smaller blocks if that distortion fails to meet the predetermined threshold. The
coding-testing procedure continues until the distortion is small enough or the smallest blocksize
is reached. In this scheme, each block is coded until its reconstruction is satisfactory,
and only then does the coder move to the next block.

Mixture Block Coding with Progressive Transmission (MBCPT) is a coding scheme which
combines MBC and progressive coding. Progressive coding is an approach that allows an initial
image to be transmitted at a lower bit rate which can later be updated [9]. In this way, successive
approximations converge to the target image with the first approximation carrying the "most"
information and the following approximations enhancing it. The process is like focusing a lens,
where the entire image is transformed from low-quality into high-quality. In progressive coding,
every pixel value, or the information contained in it, is possibly coded more than once and the
total bit rate may increase depending on the coding scheme and the quality desired. Because only the
gross features of an image are being coded and transmitted in the first pass, the processing time
is greatly reduced for the first pass and a coarse version of the image can be displayed without
significant delay. It has been shown that it is perceptually useful to get a crude image in a short
time, rather than waiting a long time to get a clear complete image.


With different stopping criteria, progressive coding is suitable for dynamic channel capacity
allocation. If a predetermined distortion threshold is met, processing is stopped and no more
refining action is needed. The threshold value can be adjusted according to the traffic condition in
the channel. Successive approximations (or iterations) are sent through the channel in progressive
coding and lead the receiver to the desired image. If these successive approximations are marked
with decreasing priority, then a sudden decrease in channel capacity may only cause the received
image to suffer from quality degradation rather than total loss of parts of the image.

MBCPT is a multipass scheme in which each pass deals with a different blocksize. The
first pass codes the image with the maximum blocksize and transmits it immediately. Only those
blocks which fail to meet the distortion threshold go down to the second pass, which processes
the difference image block, formed from the original and the coded image obtained in the first pass,
with smaller blocks. The difference image coding scheme continues until the final pass, which
deals with the minimum size block. At the receiving end, a crude image is obtained from the
first pass in a short time and the data from the following passes serve to enhance it. Fig. 1 shows
the structure of a pass consisting of 16x16 blocks for MBCPT. Fig. 2 shows the parallel structure
of MBCPT. Coding algorithms using quad trees have also been proposed by Dreizen [10] and
Vaisey and Gersho [11]. In the quad tree coding structure of this paper, the 16x16 block is coded
and the distortion of the block is calculated. If the distortion is greater than the predetermined
threshold for 16x16 blocks, the block is divided into four 8x8 blocks for additional coding. This
coding-checking procedure is continued until the only image blocks not meeting the threshold
are those of size 2x2. Figure 3 shows the algorithm.
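The multipass procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the coder used in this paper: the rounded block mean stands in for the transform coder, the mean squared residual is assumed as the distortion measure, the image dimensions are assumed to be multiples of 16, and the names block_mean, block_energy, mbcpt_passes and reconstruct are hypothetical.

```python
def block_mean(img, r, c, n):
    """Mean of the n x n block whose top-left corner is (r, c)."""
    return sum(img[r + i][c + j] for i in range(n) for j in range(n)) / (n * n)


def block_energy(img, r, c, n):
    """Mean squared value of a residual block, used here as the distortion."""
    return sum(img[r + i][c + j] ** 2 for i in range(n) for j in range(n)) / (n * n)


def mbcpt_passes(image, thresholds, max_size=16, min_size=2):
    """Return one list of (row, col, size, value) tuples per pass."""
    height, width = len(image), len(image[0])
    residual = [[float(v) for v in row] for row in image]
    pending = [(r, c) for r in range(0, height, max_size)
               for c in range(0, width, max_size)]
    passes, size = [], max_size
    while size >= min_size and pending:
        coded, failed = [], []
        for r, c in pending:
            # Stand-in block coder (assumption): send the rounded block mean.
            value = round(block_mean(residual, r, c, size))
            for i in range(size):
                for j in range(size):
                    residual[r + i][c + j] -= value
            coded.append((r, c, size, value))
            # Blocks that still miss the threshold go down to the next pass.
            if size > min_size and block_energy(residual, r, c, size) > thresholds[size]:
                half = size // 2
                failed += [(r, c), (r, c + half), (r + half, c), (r + half, c + half)]
        passes.append(coded)
        pending, size = failed, size // 2
    return passes


def reconstruct(passes, height, width):
    """Sum the contributions of the given passes, as a receiver would."""
    out = [[0.0] * width for _ in range(height)]
    for coded in passes:
        for r, c, n, value in coded:
            for i in range(n):
                for j in range(n):
                    out[r + i][c + j] += value
    return out
```

Calling reconstruct(passes[:1], height, width) gives the crude first-pass image, and passing more of the list adds the refinements of the later passes.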

The block size used in the coding scheme should be small enough for ease of processing and
storage requirements, but large enough to limit the inter-block redundancy [12]. Larger block size
results in higher compression, but it is very difficult to build real-time hardware for blocksizes
larger than 16x16 because of the increase in the number of computations. So, 16x16 is chosen
to be the largest blocksize. The minimum blocksize determines the finest visual quality that is
achievable in the busy area. If the minimum blocksize is too large, it is possible to observe the
blockiness in the coded edge of spherical objects because the coding block is square. In order to
match the zonal transform coding used in this paper, 2x2 is the smallest blocksize and there are
four passes (16x16, 8x8, 4x4, 2x2) in this scheme. Figs. 4-7 show images from the 4 passes.


After applying the discrete cosine transform, only four coefficients, including the dc and
three lowest order frequency coefficients, are coded and the others are set to zero. The dc coefficient
in the first pass is coded with an 8-bit uniform quantizer due to the fact that it closely reflects
the average gray level for that image block and is hard to model. The dc coefficient in the
subsequent passes follows a Laplacian model, and a 5-bit optimal Laplacian nonuniform quantizer
is used. The ac coefficients also follow a Laplacian model with a variance greater than that of
the dc coefficient and can therefore also be coded using a Laplacian quantizer. As an alternative,
an LBG vector quantizer with a 512 codebook size is used to quantize the vector which comprises
the three ac coefficients. The initial threshold of each pass is selected beforehand and is
readjustable during the operation according to the channel condition and quality required.
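The zonal selection described above can be illustrated with a short sketch. It assumes that the four retained coefficients are the 2x2 low-frequency corner of an orthonormal two-dimensional DCT, which is consistent with 2x2 being the smallest blocksize but is not stated explicitly in the text. Quantization is omitted, and the function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def zonal_code(block, zone=2):
    """Keep the zone x zone low-frequency corner of the 2-D DCT, zero the rest.

    Returns the retained coefficient array and the block rebuilt from it.
    """
    n = len(block)
    basis = dct_matrix(n)
    coeffs = matmul(matmul(basis, block), transpose(basis))
    kept = [[coeffs[u][v] if u < zone and v < zone else 0.0 for v in range(n)]
            for u in range(n)]
    approx = matmul(matmul(transpose(basis), kept), basis)
    return kept, approx
```

For a 2x2 block the four retained coefficients are the whole transform, so the block is rebuilt exactly apart from rounding; for larger blocks only the smooth component survives, which is why the residual is passed on to the next pass.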

Because only those blocks which fail to meet the distortion threshold need to be coded,
side information is needed to instruct the receiver on how to reconstruct the image. One bit
of overhead is needed for each block. If a block is to be divided, a 1 is assigned to be its
overhead; if not, a 0 is assigned. The example shown in Fig. 8 has the following overhead:
1,1001,1001,1001,1001,1001.
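A minimal sketch of how such one-bit flags could be written and read back is given below. The level-by-level (breadth-first) ordering, the omission of flags for the smallest blocks, and the nested-list tree representation are all assumptions made for illustration. The sketch is not claimed to reproduce the bit order of Fig. 8.

```python
from collections import deque


def encode_flags(tree, max_depth=3):
    """Serialize split flags breadth-first.

    A node is None (block not divided) or a list of four child nodes.
    Depth 0 is the 16x16 block. Blocks at max_depth (2x2) are never
    divided, so no flag is written for them (assumption).
    """
    bits = []
    queue = deque([(tree, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        if node is None:
            bits.append("0")
        else:
            bits.append("1")
            queue.extend((child, depth + 1) for child in node)
    return "".join(bits)


def decode_flags(bits, max_depth=3):
    """Rebuild the nested tree from a flag string written by encode_flags."""
    holder = [None]                    # holder[0] becomes the root node
    queue = deque([(holder, 0, 0)])    # (parent list, index in parent, depth)
    pos = 0
    while queue:
        parent, index, depth = queue.popleft()
        if depth == max_depth:
            continue
        flag = bits[pos]
        pos += 1
        if flag == "1":
            children = [None] * 4
            parent[index] = children
            queue.extend((children, k, depth + 1) for k in range(4))
    return holder[0]
```

Because the decoder consumes flags in the same order in which the encoder wrote them, no block addresses need to be sent with the side information.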


The interframe coder used in this paper is a differential scheme which is based on MBCPT.
This coder processes the difference image formed from the current frame and the previous
frame, which is locally decoded from the data of the first three passes. Fig. 9 shows the algorithm
of this coder. Fig. 10 shows a different scheme which does the local decoding with all four
passes. From Fig. 11, it can be seen that when there is no packet loss, the performances of
these two schemes are nearly the same. But when congestion occurs in the network, with the
priorities assigned to packets, packets from pass 4 are expected to be discarded first. In this
case, the performance (from Fig. 12) of the scheme in Fig. 9 is much better than the one in
Fig. 10. Therefore the coding scheme in Fig. 9 is used in our simulation. In this paper, the
Kronkite motion sequence from the USC database with 16 frames is used as the simulation
source. Every image is 256x256 pixels with graylevels ranging from 0 to 255. It is similar to a
video conferencing type image which has neither rapid motion nor scene changes. Due to this
characteristic, advanced techniques like motion detection or motion compensation have not been
used but could be implemented when broadcasting video.

From the datastream output that is listed in Table 1, we can see that the data in pass 4
represents 30-40% of the entire data. This part of the data is involved in increasing the sharpness
of the image and is usually labeled with the lowest priority in the network. We therefore call this the
least significant pass (LSP). With a substantial possibility of being discarded due to low priority,
the packets from pass 4 are not used to reconstruct the locally decoded image that is stored
in the frame memory. This prevents packet loss errors from propagating into following frames if
the lost packet belongs to pass 4.
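The effect of excluding pass 4 from the frame memory can be illustrated with a toy model. The layering below (successive rounding with step sizes 64, 16, 4, 1 on a one-dimensional "frame") is a hypothetical stand-in for the four MBCPT passes; only the handling of the reference frame follows the text.

```python
STEPS = (64, 16, 4, 1)  # hypothetical stand-in for the four passes


def split_into_passes(diff):
    """Split a difference signal into four successively finer layers."""
    residual, layers = list(diff), []
    for step in STEPS:
        layer = [step * round(v / step) for v in residual]
        residual = [v - q for v, q in zip(residual, layer)]
        layers.append(layer)
    return layers


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def encode_frame(frame, reference, ref_passes=3):
    """Code frame minus reference; update the reference from the first ref_passes layers."""
    layers = split_into_passes([f - r for f, r in zip(frame, reference)])
    new_reference = reference
    for layer in layers[:ref_passes]:
        new_reference = add(new_reference, layer)
    return layers, new_reference


def decode_frame(received, reference, ref_passes=3):
    """received holds one entry per pass; None marks a pass that was lost."""
    display, new_reference = reference, reference
    for k, layer in enumerate(received):
        if layer is None:
            continue
        display = add(display, layer)          # every received pass is displayed
        if k < ref_passes:
            new_reference = add(new_reference, layer)  # only early passes are stored
    return display, new_reference
```

With ref_passes=3, losing pass 4 leaves the decoder's frame memory identical to the encoder's, so the next frame decodes without drift. With ref_passes=4, the same loss makes the two memories differ.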


IV. SIMULATION NETWORK 

The network simulator used for this study was a modified version of an existing simulator
developed by Nelson et al. [13]. A brief description of the simulator is provided here.


A. Introduction 

As mentioned in Section 2, tomorrow’s integrated telecommunication network is a very
complicated and dynamic structure. Its efficiency requires sophisticated monitoring and control
algorithms with communication between nodes reflecting the existing capacity and reliability of
system components. The scheme for communicating information regarding the operating status
is called the system protocol. Since the communication of system information must flow through
the channel, it reduces the overall capacity of the physical layers, but hopefully provides a more
efficient system overall. Therefore, system efficiency depends entirely upon these protocols,
which, in turn, depend upon the system topology, communication channel properties, nodal
memory and component reliability. Most network protocols have been developed to provide high
reliability in topological structures with reasonably high channel reliability.

To suit the purpose of this study, most modifications which were made to the
simulator were in those modules concerning the network layer. Since the simulator is structured
in modules which represent, to some degree, the ISO Model for packet switched networks, a
more detailed description of the network layer modules follows.


B. The Network Layer and Basic Operation 

The simulation of a layer at each node is represented by a “processor” and one or more
“packet queues.” All events are scheduled through the “Sim_Q” which drives the simulator.
Initially, the processors are all idle, the packet queues are all empty and the only tasks scheduled
are the arrival of messages at the various nodes. The simulator operation occurs by examining
the next event and performing the task indicated. The task may result in the scheduling of
additional events, generally referred to as task completion times. When a message or packet is
placed in the input queue at a node for a given layer, the processor for that queue is marked
as busy, the packet is removed from the queue, and the task to be performed by the processor
is scheduled for completion. When the task is completed (as a result of the simulator reaching
that point in time), the “processor” examines the queue. If the queue is empty, the processor
is set idle; otherwise it removes the next message or packet from the queue and schedules the
completion of the operation which must be performed. The layers in the simulator are quite
close in operation to the ISO transport, network and datalink layers.
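The event loop just described can be sketched with a priority queue of timestamped events. This is an illustrative reduction to a single layer with a fixed service time, not the simulator of [13]; the Layer class, the run function and the event names are hypothetical.

```python
import heapq
from collections import deque


class Layer:
    """One processor with one packet queue and a fixed task duration."""

    def __init__(self, service_time):
        self.service_time = service_time
        self.queue = deque()
        self.busy = False
        self.done = []  # (completion_time, packet) pairs


def run(arrivals, layer):
    """Process (time, packet) arrivals through one layer; return its completions."""
    # Each event is (time, sequence number, kind, packet). The sequence number
    # breaks ties so that packets themselves are never compared.
    sim_q = [(t, n, "arrive", p) for n, (t, p) in enumerate(arrivals)]
    heapq.heapify(sim_q)
    seq = len(sim_q)
    while sim_q:
        now, _, kind, packet = heapq.heappop(sim_q)
        if kind == "arrive":
            layer.queue.append(packet)
        else:  # task completion
            layer.done.append((now, packet))
            layer.busy = False
        # An idle processor takes the next packet and schedules its completion.
        if not layer.busy and layer.queue:
            next_packet = layer.queue.popleft()
            layer.busy = True
            heapq.heappush(sim_q, (now + layer.service_time, seq, "complete", next_packet))
            seq += 1
    return layer.done
```

A full simulator would chain several such layers per node, with the completion of a task in one layer placing the packet in the queue of the next.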


(1) The Session Layer 

In the OSI model, the session layer (SL) allows users to establish “sessions” on local or
remote systems. In the simulator, as mentioned above, it contains a relatively simple model
of the subscribers, participates in flow control, and acts as a statistics collector for messages
arriving and delivered. At message arrival time (from Sim_Q), the session layer generates the
“message” with all of its randomly selected attributes and, if flow control or node hold-down
is not in effect, submits it to the transport layer. It then schedules the next message arrival
time. During initialization, the task “SL_Rcv_Msg” for each node is queued in Sim_Q for the
arrival time of the first message at that node. When this task is executed by the simulator, a
message packet is generated and placed in the transport queue. The arrival of the next message
is then queued in Sim_Q with the same task and with an arrival time determined by the random
number generator (Poisson distributed). The only other task performed by the session layer is
the “SL_Snd_Msg” task that simulates delivery of messages to the subscribers, develops message
statistics and “cleans up” the queues for messages delivered.
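The arrival scheduling can be sketched as follows, reading "Poisson distributed" as a Poisson arrival process, that is, independent exponentially distributed gaps between successive messages. That reading, the rate parameter and the function name are assumptions made for illustration.

```python
import random


def poisson_arrival_times(rate, horizon, seed=0):
    """Arrival times of a Poisson process with the given mean rate.

    Each gap between arrivals is drawn from an exponential distribution
    with mean 1/rate. Arrivals later than `horizon` are not returned.
    """
    rng = random.Random(seed)
    now, times = 0.0, []
    while True:
        now += rng.expovariate(rate)
        if now > horizon:
            return times
        times.append(now)
```

In the simulator each new arrival time would be queued in Sim_Q as soon as the previous message has been generated, rather than being computed in advance as in this sketch.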
(2) The Transport Layer

The basic function of the transport layer at the sending end is to receive the message from the
session layer, place it in packets and pass the packets on to the network layer. At the receiving
end, the packets are reassembled into a message for delivery to the session layer. To accomplish
the complex task of assuring reliable delivery, there is a transport time-out mechanism at both
the sending and receiving nodes and a message acknowledgement packet that is sent to the
sending node when all packets for the message have been satisfactorily received. At the sending
end, if a message acknowledgement is not received in the allotted time period, the message can
be retransmitted. In the simulations reported in this paper, the retransmission feature was not
used. At the receiving end, if all packets are not received in the specified period of time, the
entire message is discarded. It is recognized that in some networks, packetization takes place
at the network level, leaving the transport layer responsible only for message-level structures.
Reassembly, depending upon the protocol, can take place as low as the datalink level. These
tasks were both placed in the transport layer, but are modular, and could be extracted and
placed elsewhere. Also, the simulator was originally designed for datagram service, and since
the packets do not necessarily arrive in order, it is unlikely that assembly would take place
at the datalink level.
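The packetize and reassemble tasks of the transport layer, including the receiving-end time-out, can be sketched as below. The packet fields ("msg", "seq", "total", "data") and the function names are hypothetical. Acknowledgement and retransmission are left out, in line with the simulations reported here.

```python
def packetize(message_id, payload, size):
    """Split a message payload into numbered packets of at most `size` bytes."""
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    return [{"msg": message_id, "seq": n, "total": len(chunks), "data": chunk}
            for n, chunk in enumerate(chunks)]


def reassemble(arrivals, deadline):
    """Rebuild the payload from (arrival_time, packet) pairs, or return None.

    Packets may arrive in any order. If any packet is still missing at the
    deadline, the whole message is discarded.
    """
    parts, total = {}, None
    for arrival_time, packet in arrivals:
        if arrival_time <= deadline:
            parts[packet["seq"]] = packet["data"]
            total = packet["total"]
    if total is None or len(parts) < total:
        return None
    return b"".join(parts[n] for n in range(total))
```

Keying the received parts by sequence number is what allows datagram-style, out-of-order delivery to be handled at this level.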


(3) The Network Layer 

The network layer is concerned with controlling the operation of the network. A key design
issue is determining how packets are routed from source to destination. Another issue is how to
avoid the congestion caused when too many packets are presented to the network at the same
time. In the simulator, the network layer performs all of the functions related to these two
aspects with the exception of that aspect of flow control which takes place at the session layer,
and the recovery protocols which require some service from the datalink layer. It also activates
new channels when needed and determines when packets originating at other nodes are to be
discarded. The network layer is currently the most dynamic with regard to the coding of modules.
Five modules currently comprise the network layer. These include three relatively static modules:
one module for capturing lines or channels when more capacity is required and releasing them when
they are not needed, one module for the network processor and queue handling, and one module
for the routines which are common to most routing algorithms. This leaves two modules for the
dynamic parts of the routing and flow control algorithms.


(4) The Datalink Layer 

The main task of the datalink layer is to take a raw transmission facility and transform it
into a line or channel that appears free of transmission errors to the network layer. It simulates
the sending of the message over the channel and the delivery at the other end. When a packet is
received, the datalink acknowledgement is initiated either by the piggy-back acknowledgement
or by generating a datalink acknowledgement packet. As mentioned previously, the datalink
level also simulates the physical layer on a statistical basis. (Entered bit error rates are used in
conjunction with a random number generator to determine if messages are corrupted.) When
a line is "brought up", health packets are used to establish initial connections. Also, when a
line "goes down", an active node will immediately issue health check packets to ascertain when
the channel is again available.
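One simple way to turn an entered bit error rate into a per-packet corruption decision is sketched below. It assumes independent bit errors, so that a packet of n bits is corrupted with probability 1 - (1 - p)^n; the text does not say which error model the simulator uses, and the function names are hypothetical.

```python
import random


def corruption_probability(n_bits, bit_error_rate):
    """Probability that at least one of n_bits independent bits is in error."""
    return 1.0 - (1.0 - bit_error_rate) ** n_bits


def is_corrupted(n_bits, bit_error_rate, rng):
    """Draw one random number to decide whether a packet is corrupted."""
    return rng.random() < corruption_probability(n_bits, bit_error_rate)
```

Passing a seeded random.Random instance as rng keeps a simulation run repeatable.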


C. Modifications 

A major problem of using this system as a simulation tool for the study of packet video
is that as initially designed the system did not actually transmit messages from node to node.
While a “packet” carrying all the necessary describing information moved from node to node,
there was no actual data in the packet. Therefore, modifications had to be made to the simulator
to accommodate the video data. In the sending node, a field called "Image" which contains
real image data is attached to the record "Packet_Ptr" allocated to the message generated in
the session layer. There are three new modules in this layer. First, "Get_Image" puts the
image data into the image field of a message generated at a specific time and node. Second,
"Image_Available" checks to see if there is any image data that still needs to be transmitted. If
that is true, the following message, generated at that specific node, is still an image message and
contains some image data. Third, "Receive_Image" collects the image data in the session layer
of the receiving node when the flag "Image_Complete" is on. In module "Session_Msg_Arrive",
different priorities are assigned to different messages. In module "Session_Msg_Send", some
statistics are calculated including the number of lost image packets and the transmission delay
for image packets.

In the original design, the transport layer simply duplicated the same packet with different
assumed sequential packet numbers without actually packetizing the message. The module
"Transport_Packetize" has been modified to actually packetize the image data which resides in the
message record queued in "Transport_Q" when it is called. The module "Transport_Reassemble"
is called to reassemble these image packets according to their packet number when the flag
"Image_Content" defined in "Packet_Ptr" is true. The network layer is responsible for routing
and flow control. This module was already very well developed, so the modifications to be
performed here were relatively minor. In the datalink layer, in order to simulate the delivery of
packets through the channel, a new packet is generated at the receiving node and the information
including the image data from the transmitted packet (which will still be resident at the sending
node) is copied into it. Using existing bit-error-rates, the transmission success rate can be set
and bit errors can be inserted in both the data and control bits in the packet. Errors in the control
bits are simulated separately as long as the error rates are consistent. If an error in control bits
occurs, the transmission is assumed to fail and retransmission will occur, again depending on
the threshold of the timeout number. In addition to the modifications made to the layer modules,
we had to arrange some new memory elements allocated for image messages and packets. In
order to make sure the simulation is run in the steady state, the image data is made available
to the network after some simulation time has passed.

V. INTERACTION OF THE CODER AND THE NETWORK

When the video data is packed and sent into a nonideal network, some problems emerge.
These are discussed in the following section.


A. Packetization 

The task of the packetizer is to assemble video information, coding mode information,
if it exists, and synchronization information into transmission cells. In order to prevent the
propagation of the error resulting from packet loss, packets are made independent of each
other and no data from the same block or same frame is separated into different packets. The
segmentation process in the transport layer has no information regarding the video format. To
avoid the bit stream being cut randomly, the packetization process has to be integrated with the
encoder, which is in the presentation layer of the user's premises. Otherwise, some overhead has
to be added into the datastream to guide the transport layer to perform the packetization in the
desired manner. In order to limit the delay of packetization, it is necessary to stuff the last cell
of a video packet with dummy bits if the cell is not completely full.

Every packet must contain an absolute address which indicates the location of the first block
it carries. Because every block in MBCPT has the same number of bits in each pass, there is no
need to indicate the relative address of the following blocks contained in the same packet. There
always exists a tradeoff between packaging efficiency and error resilience. If error resilience
is considered, one packet should contain a smaller number of blocks. However, since each
channel access by a station contains overhead, the packet length should be large for transmission
efficiency. Fixed length packetization is used in this paper for simplicity.
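A fixed-length packetizer of the kind described above might look as follows. Bit strings are used for readability. The "count" field, which lets the receiver tell coded blocks from stuffing bits, is an assumption added for this sketch, as are the field and function names.

```python
def packetize_pass(blocks, bits_per_block, payload_bits):
    """Pack the blocks of one pass into fixed-length packets.

    blocks is a list of (address, bitstring) pairs in transmission order,
    each bitstring holding exactly bits_per_block characters. Every packet
    carries the absolute address of its first block only, and the last
    packet is stuffed with dummy '0' bits up to payload_bits.
    """
    per_packet = payload_bits // bits_per_block
    packets = []
    for start in range(0, len(blocks), per_packet):
        group = blocks[start:start + per_packet]
        body = "".join(bits for _, bits in group)
        packets.append({
            "first_address": group[0][0],
            "count": len(group),  # assumption: lets the receiver skip stuffing
            "payload": body.ljust(payload_bits, "0"),
        })
    return packets


def unpack(packet, bits_per_block):
    """Recover the per-block bit strings from one packet."""
    payload = packet["payload"]
    return [payload[k * bits_per_block:(k + 1) * bits_per_block]
            for k in range(packet["count"])]
```

A smaller payload_bits puts fewer blocks at risk per lost packet, while a larger one spreads the per-packet overhead over more blocks, which is the tradeoff noted in the text.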

Because of the structure of the coding scheme, the packets are classified into four priorities,
with the packets from the first pass classified as the highest priority packets, and the packets
from the fourth pass as the lowest priority packets.

This priority assignment also reflects the importance of the various packets to the
reconstruction of the image sequence at the receiver. Table 1 shows the effect of applying
the same amount of packet loss in each pass on the reconstruction error in the received sequence.
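How a congested node might use these priorities can be sketched as follows. The discard rule (keep at most a fixed number of packets, dropping the lowest priority first and, within a priority, the latest arrivals) is an assumption made for illustration and is not taken from the simulator.

```python
def admit(packets, capacity):
    """Keep at most `capacity` packets, discarding the lowest priority first.

    Each packet is a dict with a 'priority' key: 1 for pass 1 (highest)
    through 4 for pass 4 (lowest). Arrival order is preserved in both
    returned lists.
    """
    # sorted() is stable, so earlier arrivals win among equal priorities.
    order = sorted(range(len(packets)), key=lambda i: packets[i]["priority"])
    keep = set(order[:capacity])
    kept = [p for i, p in enumerate(packets) if i in keep]
    dropped = [p for i, p in enumerate(packets) if i not in keep]
    return kept, dropped
```

Under this rule pass 4 packets absorb the loss first, and pass 1 packets are dropped only when the buffer cannot even hold the higher-priority traffic.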


B. Error Recovery 

There is no way to guarantee that packets won’t get lost after being sent into the network.
Packet loss can be mainly attributed to two problems. First, bit errors can occur in the address
field, leading the packets astray in the network. Second, congestion can exceed the network's
management ability, and packets are forced to be discarded due to buffer overflow. Effects
created by higher pass packet loss (like pass 4) in MBCPT coding will be masked by the basic
passes and replaced with zeros. The distortion is almost invisible when viewing at video rates
because the lost area is scattered spatially and over time. However, low pass packet loss (like
pass 1), though rare due to high priority, will create an erasure effect due to packetization, and
the effect is very objectionable.

Considering the tight time constraint, retransmission is not feasible in packet video. It
may also result in more severe congestion. Thus, error recovery has to be performed by the
decoder alone. In our differential MBCPT scheme, the packets from pass 4 are labeled lowest
priority and form a great part of the complete data. These packets can be discarded whenever
network congestion occurs. That will reduce the network congestion and won’t cause too much
degradation in quality. The erasures caused by basic pass loss are simply covered with the
reconstructed values from the corresponding area in the previous frame. This remedy seems
insufficient even when there is only a small amount of motion in that area. Motion detection
and motion compensation could be used to find a best matched area for replacement in the
previous frame.
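
The simple replenishment from the previous frame can be sketched as below. This is a minimal illustration under assumed data structures: frames are lists of pixel rows, and each erased block is given as a hypothetical `(top, left, size)` triple. It copies the co-located area only and does not perform the motion-compensated search suggested above.

```python
from typing import Iterable, List, Tuple

Frame = List[List[int]]


def conceal_lost_blocks(current: Frame, previous: Frame,
                        lost_blocks: Iterable[Tuple[int, int, int]]) -> Frame:
    """Cover erased square blocks with the previous frame's pixels.

    lost_blocks: (top, left, size) for each erased block, in pixels.
    Returns a new frame; `current` is left unmodified.
    """
    out = [row[:] for row in current]
    for top, left, size in lost_blocks:
        for r in range(top, top + size):
            # Copy the co-located row segment from the previous frame.
            out[r][left:left + size] = previous[r][left:left + size]
    return out
```

Because the copied area is co-located, any motion between the two frames shows up as a visible mismatch, which is the weakness noted in the text.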


Side information in the MBCPT decoding scheme is very important, so this vital information
must not be lost. Two methods can be used for protection. First, error control coding,
like block codes or convolutional codes, can be applied in both directions, along with and
perpendicular to the packetization. The former is for bit errors in the data field while the latter
is for packet loss. The minimum distance that the error control coding should provide depends
on the network’s probability of packet loss, the correlation of such loss and the channel bit error rate.
Second, from Table 2, we can see that the output rate of side information and pass 1 and
even pass 2 is quite steady. It seems feasible to reserve a certain amount of channel capacity
for these outputs to ensure their timely arrival. That means circuit-switching can be used for
important and steady data.
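
As an illustration of coding perpendicular to the packetization, the sketch below uses a single-parity-check code across a group of equal-length packets. This code is chosen here only because it is the simplest block code; the paper does not say which code would be used. Its minimum distance is 2, so it can restore exactly one lost packet per group. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: XOR of all payloads in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Restore a group in which exactly one packet (None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("single parity can restore exactly one lost packet")
    present = [p for p in received if p is not None]
    # XOR of the parity with all surviving packets yields the lost one.
    restored = reduce(xor_bytes, present, parity)
    out = list(received)
    out[missing[0]] = restored
    return out
```

A network with a higher or more correlated loss probability would need a code with a larger minimum distance than this one.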


C. Flow Control 


In order to shield the viewer from severe network congestion, there are some flow control
schemes which are considered useful. If there is an interaction between the encoder and the
transport layer, then the encoder can be informed about the network condition. Depending on
that, the encoder can adjust its coding scheme. In the MBCPT coding scheme, if the buffer
is getting full, that means that the bit generating rate is overwhelming the packetization rate,
and the encoder will switch to a coarse quantizer with fewer steps or loosen the threshold to
decrease its output rate. In this way, smooth quality degradation is obtainable. However, this
also complicates the encoder design.
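
The buffer-driven adjustment of the threshold can be sketched as a simple feedback rule. Everything numeric here is an assumption made for illustration: the watermarks `high` and `low`, the multiplicative `step`, and the idea of tightening the threshold again when the buffer drains are not taken from the paper.

```python
def adapt_threshold(threshold: float, buffer_bits: int, buffer_capacity: int,
                    high: float = 0.8, low: float = 0.3,
                    step: float = 1.25) -> float:
    """Return the coding threshold to use for the next frame.

    A looser (larger) threshold makes the coder send fewer bits, so the
    threshold is raised when the buffer is nearly full and lowered again
    when the buffer has drained.
    """
    fill = buffer_bits / buffer_capacity
    if fill > high:
        return threshold * step   # buffer nearly full: reduce output rate
    if fill < low:
        return threshold / step   # buffer nearly empty: restore quality
    return threshold              # in between: leave the coder unchanged
```

Keeping a gap between the two watermarks avoids toggling the threshold on every frame, which helps the quality change smoothly.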


It is possible to use the congestion control of the network protocols to prevent a drastic
quality change by assigning different priorities to packets from different passes. Discarding
packets blindly, without identifying the importance of each packet, sometimes brings
disaster and can cause a session shutdown. For example, if the side information gets lost,
it can have a severe impact on the decoding process. In the MBCPT coding scheme, side
information and packets from pass 1 are assigned the highest priority, and higher pass packets are
assigned decreasing priority.

D. Interaction with Protocols


In the ISO model, the physical, datalink and network layers comprise the lower layers which
form a network node. The higher layers comprise the transport, session, presentation and application
layers and typically reside in a customer’s premises. The lower layers have nothing to do with
the signal processing and only work as a "packet pipe". The physical layer requires adequate
capacity and a low bit-error-rate, which are determined only by technology. The datalink layer can
only deal with link management because its other mechanisms, like requesting retransmission, are not
feasible in packet video transmission. The network layer has to maintain orderly transmission by
removing the delay jitter with input buffering. In addition, it can take care of network congestion
by assigning transmission priority.
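
Removing delay jitter with input buffering amounts to releasing every packet at a fixed offset after it was sent. The sketch below illustrates this under assumptions that are not in the paper: the function name `playout_schedule`, timestamps on a common clock, and a fixed `playout_delay`.

```python
from typing import List, Tuple


def playout_schedule(send_times: List[float], arrival_times: List[float],
                     playout_delay: float) -> List[Tuple[float, bool]]:
    """Compute jitter-free release times for buffered packets.

    Each packet is released at send_time + playout_delay, so the spacing
    between released packets equals the spacing at the sender. A packet
    that arrives after its release time is marked as late (False) and
    has to be treated as lost.
    """
    schedule = []
    for sent, arrived in zip(send_times, arrival_times):
        release = sent + playout_delay
        schedule.append((release, arrived <= release))
    return schedule
```

A larger `playout_delay` absorbs more jitter but adds end-to-end delay, so the value has to respect the tight time constraint of video.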


As the higher layers reside in the customer’s premises, they perform all the functions of the
packet video coder. The transport layer does the packetization and reassembly. The packet
length can be fixed or variable. A fixed packet length simplifies segmentation and packet handling,
while a variable packet length can keep the packetization delay constant. The session layer
supervises set-up and tear-down for sessions which have different types and quality. There is
always a tradeoff between quality and cost. The quality of a set-up session can be determined
by the threshold in the coding scheme and the priority assignment for transmission. Of course,
the better the quality, the higher the cost. Fig. 13 shows the tradeoff between PSNR and video
output rate by adjusting thresholds. The presentation layer does most of the signal processing,
including separation and compression. Because it knows the video format exactly, if any error
concealment is required, it will be performed here. The application layer works as a boundary
between the user and the network and deals with all the analog-digital signal conversion.


VI. PERFORMANCE RESULTS 

Results obtained in this packet video simulation show that substantial compression can
be obtained while maintaining high image quality through the use of this differential MBCPT
scheme. The monochrome sequence used in this simulation contains 16 frames, each of size
256x256 pixels with 8 bits per pixel, which results in a bit rate of 15.3 Mbits/s, given a video
rate of 30 frames/s. As Table 2 shows, the average data rate of our system is 1.539 Mbits/s.
The compression rate is about 10 with a mean PSNR of 38.74 dB, where PSNR is defined as

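$$ \mathrm{PSNR} = 10 \log_{10} \frac{(255)^2}{\mathrm{MSE}} $$

where MSE is the mean squared error between the original and the reconstructed frame.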
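
The PSNR figure can be computed as in the following sketch. It assumes the usual definition for 8-bit images, with the mean squared error (MSE) between the original and reconstructed frame in the denominator, and it represents frames as lists of pixel rows. The function name `psnr` is illustrative.

```python
import math
from typing import List


def psnr(original: List[List[int]], reconstructed: List[List[int]],
         peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-size frames."""
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10 * math.log10(peak ** 2 / mse)
```

The mean PSNR quoted for a sequence is then the average of this value over its frames.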

Fig. 14 shows the data rate of the sequence frames for the side information, the 4 passes and
the total rate. It is clear that the data rate of pass 1 is constant as long as the quantization mode
remains the same. The rates of the side information and of pass 2, and even pass 3, are also quite
steady. The data rate of pass 4 is bursty and highly uncorrelated. As pass 4 data is not essential
to the reconstruction of the image, the rate profiles as shown in Figure 14 and Table 1 suggest the
use of a reserved channel of some sort for passes 1-3 and the side information, and a perhaps more
unreliable channel for pass 4 data, which comprises more than 30% of the total traffic. Such a situation
can be accommodated in a variety of systems such as a token ring network or a circuit switched 
network with a packet-switched overlay. 

Fig. 15 shows the PSNR for each frame in the sequence. Notice that the standard deviation
of the PSNR is only 0.2 dB, which implies a substantial uniformity of quality, at least in terms
of objective performance measures. If constancy with regard to some subjective criterion is
desired, it would be necessary to incorporate this in the determination of the thresholds and
the decision mechanism for the quad tree. In the simulation, the same threshold has been used
throughout the sequence. If further flexibility, say for higher visual quality, is desired, a varying
threshold can be used for different frames. That may generate a more variable bit rate.


From the difference images of this sequence, frames 1-8 seem quite motionless while frames
9-13 contain substantial motion. We adjusted the traffic condition of the network to force some
of the packets to get lost and thus check the robustness of the coding scheme. Heavy traffic
was set up in the motionless and motion periods separately. The average packet loss percentage
was 3.3%, which is considered high for most networks. Fig. 16 shows images which suffer
packet losses from pass 4. As can be seen, the effect of lost packets is not at all severe, even if
the lost packet rate is unrealistically high. This is because the performance from the first three
passes is relatively good and the packets from the fourth pass are not essential for reconstruction.
Fig. 17 shows the case when packet loss occurs in pass 1. Clearly there are visible defects in
the motion period. What’s worse, the error will propagate to the following frames. Apparently,
the replenishing scheme used here is not sufficient in areas with motion. It is believed that
this inconsistency can be eliminated with a motion compensation algorithm which would find the
appropriate area for replenishment and error concealment which limits the propagation of error.


VII. CONCLUSIONS 

The network simulator was used only as a channel in this simulation. In fact, before the real-time
processor is built, a lot of statistics can be collected from the network simulator to improve
upon the coding scheme. These include transmission delays and losses from various passes under
different network loads. For resynchronization, the delay jitter between received packets can also
be estimated from the simulation. The environment for tomorrow’s telecommunication has been
described and requires a flexibility which is not possible in a circuit-switched network. With all
the requirements for applying packet video in mind, MBCPT has been investigated. It is found
that MBCPT has appealing properties, like a high compression rate with good visual performance,
robustness to packet loss, tractable integration with network mechanics and simplicity in parallel
implementation. Some additional considerations have been proposed for the entire packet video
system, like designing protocols, packetization, error recovery and resynchronization. For fast
moving scenes, the differential MBCPT scheme seems insufficient. Motion compensation, error
concealment or even attaching function commands into the coding scheme are believed to be
useful tools to improve the performance and will be the direction of future research.